StarCraft 2 Replay Data Exploration by Benjamin Xiao

Introduction

StarCraft 2 is the sequel to the ever-popular real-time strategy game StarCraft Brood War. In its competitive form, two players face each other on various maps, each choosing one of three available races or a random pick. The dataset used can be downloaded here: https://www.kaggle.com/sfu-summit/starcraft-ii-replay-analysis and the original study of this dataset is at http://summit.sfu.ca/item/13328

I am exploring this dataset to look at mechanical differences among players of different skill levels. The skill levels are broadly placed into bins from 1 to 8, with 8 being the most skilled. The top two bins are also the smallest groups in the sample.

Each variable in the dataset is hypothesized to be correlated with higher skill. For instance, better players will make and be able to use more complex units. Lower level players generally play with lower APM and use fewer of the mechanical tools available (hot keys and mini map functions). Most variables are divided by the time elapsed so that measures that naturally accumulate over a game are comparable across games of different lengths.

This data is also a snapshot of the overall skill levels for players at the time of data collection and for the specific expansion called Wings of Liberty. The explanation of league placement is shown here: http://wiki.teamliquid.net/starcraft2/Battle.net_Leagues

My main question is: which variables affect skill level most? A second question is what habits separate players in the top placements. The answers can provide insight into where players at each skill level hit skill walls and must work deliberately in order to improve and move into higher league placements.

Univariate Plots Section

##      GameID       LeagueIndex         Age         HoursPerWeek   
##  Min.   :   52   Min.   :1.000   Min.   :16.00   Min.   :  0.00  
##  1st Qu.: 2464   1st Qu.:3.000   1st Qu.:19.00   1st Qu.:  8.00  
##  Median : 4874   Median :4.000   Median :21.00   Median : 12.00  
##  Mean   : 4805   Mean   :4.184   Mean   :21.65   Mean   : 15.91  
##  3rd Qu.: 7108   3rd Qu.:5.000   3rd Qu.:24.00   3rd Qu.: 20.00  
##  Max.   :10095   Max.   :8.000   Max.   :44.00   Max.   :168.00  
##                                  NA's   :55      NA's   :56      
##    TotalHours             APM         SelectByHotkeys   
##  Min.   :      3.0   Min.   : 22.06   Min.   :0.000000  
##  1st Qu.:    300.0   1st Qu.: 79.90   1st Qu.:0.001258  
##  Median :    500.0   Median :108.01   Median :0.002500  
##  Mean   :    960.4   Mean   :117.05   Mean   :0.004299  
##  3rd Qu.:    800.0   3rd Qu.:142.79   3rd Qu.:0.005133  
##  Max.   :1000000.0   Max.   :389.83   Max.   :0.043088  
##  NA's   :57                                             
##  AssignToHotkeys     UniqueHotkeys       MinimapAttacks     
##  Min.   :0.0000000   Min.   :0.000e+00   Min.   :0.000e+00  
##  1st Qu.:0.0002042   1st Qu.:3.275e-05   1st Qu.:0.000e+00  
##  Median :0.0003526   Median :5.340e-05   Median :3.990e-05  
##  Mean   :0.0003736   Mean   :5.873e-05   Mean   :9.831e-05  
##  3rd Qu.:0.0004988   3rd Qu.:7.865e-05   3rd Qu.:1.189e-04  
##  Max.   :0.0017522   Max.   :3.376e-04   Max.   :3.019e-03  
##                                                             
##  MinimapRightClicks   NumberOfPACs      GapBetweenPACs    ActionLatency   
##  Min.   :0.0000000   Min.   :0.000679   Min.   :  6.667   Min.   : 24.09  
##  1st Qu.:0.0001401   1st Qu.:0.002754   1st Qu.: 28.958   1st Qu.: 50.45  
##  Median :0.0002815   Median :0.003395   Median : 36.724   Median : 60.93  
##  Mean   :0.0003874   Mean   :0.003463   Mean   : 40.362   Mean   : 63.74  
##  3rd Qu.:0.0005141   3rd Qu.:0.004027   3rd Qu.: 48.291   3rd Qu.: 73.68  
##  Max.   :0.0040408   Max.   :0.007971   Max.   :237.143   Max.   :176.37  
##                                                                           
##   ActionsInPAC    TotalMapExplored     WorkersMade       
##  Min.   : 2.039   Min.   :0.0000913   Min.   :0.0000770  
##  1st Qu.: 4.273   1st Qu.:0.0002244   1st Qu.:0.0006830  
##  Median : 5.096   Median :0.0002695   Median :0.0009052  
##  Mean   : 5.273   Mean   :0.0002825   Mean   :0.0010317  
##  3rd Qu.: 6.034   3rd Qu.:0.0003253   3rd Qu.:0.0012587  
##  Max.   :18.558   Max.   :0.0008319   Max.   :0.0051493  
##                                                          
##  UniqueUnitsMade     ComplexUnitsMade    ComplexAbilityUsed 
##  Min.   :1.970e-05   Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:6.780e-05   1st Qu.:0.000e+00   1st Qu.:0.0000000  
##  Median :8.220e-05   Median :0.000e+00   Median :0.0000203  
##  Mean   :8.455e-05   Mean   :5.943e-05   Mean   :0.0001419  
##  3rd Qu.:9.860e-05   3rd Qu.:8.555e-05   3rd Qu.:0.0001814  
##  Max.   :2.019e-04   Max.   :9.023e-04   Max.   :0.0030837  
##                                                             
##   MaxTimeStamp    WorkersCreatedTotal
##  Min.   : 25224   Min.   :  2.00     
##  1st Qu.: 60090   1st Qu.: 50.00     
##  Median : 81012   Median : 74.00     
##  Mean   : 83598   Mean   : 83.91     
##  3rd Qu.:102074   3rd Qu.:107.00     
##  Max.   :388032   Max.   :450.00     
## 

## 
##   1   2   3   4   5   6   7   8 
## 167 347 553 811 806 621  35  55

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   16.00   19.00   21.00   21.65   24.00   44.00      55
## 
##  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33 
## 256 248 325 313 357 344 314 259 225 168 136 111  73  52  32  29  21  15 
##  34  35  36  37  38  39  40  41  43  44 
##  15  17   8   5   5   3   4   3   1   1

These are counts of the contributing players by league and by age. Technically, professional and grandmaster players are both in the grandmaster category, but the community agrees pro players are in a different class.
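To see how unevenly the sample is spread across leagues, the LeagueIndex counts above can be turned into shares. This is a minimal Python sketch rather than the code behind this report (the output shown here comes from R); the variable names are my own, and the counts are copied from the LeagueIndex table.

```python
# Player counts per LeagueIndex, copied from the table above.
league_counts = {1: 167, 2: 347, 3: 553, 4: 811, 5: 806, 6: 621, 7: 35, 8: 55}

total_players = sum(league_counts.values())

# Share of the sample in each league.
league_shares = {league: n / total_players for league, n in league_counts.items()}

# Share of the sample placed in league 4 or higher.
share_league_4_plus = (
    sum(n for league, n in league_counts.items() if league >= 4) / total_players
)

for league, share in league_shares.items():
    print(f"League {league}: {share:.1%}")
print(f"League 4 and above: {share_league_4_plus:.1%}")
```

Leagues 4 through 6 hold most of the sample, while leagues 7 and 8 together make up under 3% of it.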

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    0.00    8.00   12.00   15.91   20.00  168.00      56

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0     8.0    12.0    15.8    20.0    98.0

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
##       3.0     300.0     500.0     960.4     800.0 1000000.0        57

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     3.0   300.0   500.0   642.7   800.0 10260.0

The hours spent playing StarCraft 2 are right skewed with many outliers. Zooming into hours per week, we can see most players play between 0 and 35 hours a week. Zooming into total hours, most players fall between 0 and 1500 hours. The higher end of the total hours played is simply impossible, so I subsetted the data in the final summary.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   22.06   79.90  108.01  117.05  142.79  389.83

The majority of APM values are somewhere between 50 and 200. Zooming in, we get a clearer view that most players play within the 80 to 150 APM range. I chose to display all the APM values. Although the highest values look ridiculous, they are possible, and some people really do play that fast.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   6.667  28.958  36.724  40.362  48.291 237.143

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.0000913 0.0002244 0.0002695 0.0002825 0.0003253 0.0008319

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.0000770 0.0006830 0.0009052 0.0010317 0.0012587 0.0051493

These are general, single-number measurements that differ across skill levels. Of course, these are all naturally correlated with APM, so it makes sense that the distributions are similar to the APM histogram. Since they are all normalized by the same elapsed time, it also makes sense that their distributions have similar shapes.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2.00   50.00   74.00   83.91  107.00  450.00

Univariate Analysis

What is the structure of your dataset?

There are 3395 contributing players to this dataset with one general skill categorization (LeagueIndex), player information (Age, HoursPerWeek, TotalHours), and general measurements of skill.

LeagueIndex goes from 1 to 8, with 1 being the worst and 8 meaning you should go compete in tournaments.

Observations:

  • Most contributing users are in the top half of the player pool according to the league placement distribution.

  • Mean APM among these players is 117.05, which is a little lower than I expected.

  • All the variables that have been divided by time are very similarly distributed.

  • The max value in TotalHours is 1000000 hours, which is definitely an error because that is physically impossible. The game was released in July 2010, and the data was collected in 2013, so even a player who never stopped could have logged only about 30,000 hours. This is probably just some people being funny.
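The claim that the largest TotalHours value is impossible can be checked with a little date arithmetic. The sketch below is Python, not the R used for this report. Wings of Liberty launched on July 27, 2010; the exact collection date is not given here beyond "2013", so the last day of 2013 is an assumed, deliberately generous upper bound.

```python
from datetime import date

# Wings of Liberty release date.
release_day = date(2010, 7, 27)
# Assumption: only the year 2013 is known, so use its last day as a
# generous upper bound on when the data could have been collected.
latest_collection_day = date(2013, 12, 31)

days_available = (latest_collection_day - release_day).days
max_possible_hours = days_available * 24

# Largest TotalHours value in the summary above.
reported_max_hours = 1_000_000

print(f"Hours since release (upper bound): {max_possible_hours}")
print(f"Reported maximum is impossible: {reported_max_hours > max_possible_hours}")
```

Even with nonstop play since release, the bound is far below the reported maximum, so that value cannot be a real play time.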

What is/are the main feature(s) of interest in your dataset?

I’d primarily like to see which measure or collection of measurements characterizes skill best. I suspect the PAC data is the most telling overall, and that GapBetweenPACs is what separates players just before they reach GrandMaster or higher placement. Secondarily, I’d like to explore the relationship with time spent playing to help debunk the idea that ability in StarCraft 2 is fixed.

What other features in the dataset do you think will help support your
investigation into your feature(s) of interest?

I’m guessing hot key usage is the next best predictor of skill. Habitually using hot keys lets a player play more fluidly and faster, and manage various aspects of the game simultaneously. Neglecting hot keys, particularly selecting by hot key (SelectByHotkeys), significantly hinders overall gameplay.

Did you create any new variables from existing variables in the dataset?

I created a total workers made variable by multiplying WorkersMade and MaxTimeStamp. Since there is no such thing as a fraction of a worker, I used the round() function to make everything integers instead.
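The derived variable is just the per-timestamp rate multiplied back by the game length and rounded. Here is a minimal Python sketch of that step (the report itself used R's round()); the function name is hypothetical, and the inputs are the dataset-wide means from the summary above, not a real player's row.

```python
def total_from_rate(rate_per_timestamp, max_timestamp):
    """Turn a per-timestamp rate back into a whole-number count."""
    return round(rate_per_timestamp * max_timestamp)

# Illustrative inputs only: the overall means of WorkersMade and
# MaxTimeStamp from the summary table, not an actual replay.
mean_workers_made = 0.0010317
mean_max_timestamp = 83598

workers_total = total_from_rate(mean_workers_made, mean_max_timestamp)
print(workers_total)
```

The result lands near, but not exactly on, the WorkersCreatedTotal mean of 83.91. That is expected, since the product of two means is generally not the mean of the per-player products.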

Of the features you investigated, were there any unusual distributions?
Did you perform any operations on the data to tidy, adjust, or change the form
of the data? If so, why did you do this?

I subsetted the TotalHours and HoursPerWeek variables to eliminate impossible values, such as 1000000 total hours when the game has not been out for that long. I subsetted HoursPerWeek to values less than 112 hours because there are only 168 hours in a week, and I’m assuming a third of those hours are spent sleeping or doing other things besides playing StarCraft 2.
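The subsetting rule can be sketched as a simple row filter. This is a Python illustration under stated assumptions, not the original R code: the rows are made up, the TotalHours cutoff is a hypothetical placeholder because the threshold actually used is not stated, and rows with missing hours are dropped because the post-subset summaries above show no NA's.

```python
HOURS_IN_WEEK = 7 * 24                       # 168
MAX_HOURS_PER_WEEK = HOURS_IN_WEEK * 2 // 3  # 112, a third of the week left for everything else
# Hypothetical cutoff: the threshold applied to TotalHours is not stated.
MAX_TOTAL_HOURS = 30_000

# Made-up rows for illustration only; None stands in for R's NA.
players = [
    {"GameID": 1, "HoursPerWeek": 12, "TotalHours": 500},
    {"GameID": 2, "HoursPerWeek": 168, "TotalHours": 800},
    {"GameID": 3, "HoursPerWeek": 20, "TotalHours": 1_000_000},
    {"GameID": 4, "HoursPerWeek": None, "TotalHours": None},
]

def is_plausible(row):
    """Keep a row only if both hour fields are present and below the cutoffs."""
    hours_per_week = row["HoursPerWeek"]
    total_hours = row["TotalHours"]
    if hours_per_week is None or total_hours is None:
        return False
    return hours_per_week < MAX_HOURS_PER_WEEK and total_hours < MAX_TOTAL_HOURS

plausible_players = [row for row in players if is_plausible(row)]
print([row["GameID"] for row in plausible_players])
```

Only the first made-up player survives: the second fails the weekly cap, the third fails the total cap, and the fourth has no hours recorded.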

Just looking through the data, players with LeagueIndex 8, the professional players, don’t have data for age or time.

Bivariate Plots Section

##                      LeagueIndex          Age HoursPerWeek    TotalHours
## LeagueIndex          1.000000000 -0.127517857   0.21792962  0.0238835076
## Age                 -0.127517857  1.000000000  -0.18443135 -0.0166302848
## HoursPerWeek         0.217929622 -0.184431348   1.00000000  0.0243063070
## TotalHours           0.023883508 -0.016630285   0.02430631  1.0000000000
## APM                  0.624171097 -0.210724021   0.24689710  0.0728500192
## SelectByHotkeys      0.428636748 -0.131105130   0.20579209  0.0818295456
## AssignToHotkeys      0.487279182 -0.104960182   0.15831472  0.0424145994
## UniqueHotkeys        0.260881605 -0.004889105   0.06227475  0.0079529503
## MinimapAttacks       0.270524533  0.043098035   0.08410429  0.0008742143
## MinimapRightClicks   0.206379509 -0.019903780   0.04949846  0.0076621610
## NumberOfPACs         0.589193503 -0.197127662   0.17487149  0.0395768410
## GapBetweenPACs      -0.537535661  0.112106116  -0.13383771 -0.0206443484
## ActionLatency       -0.659940252  0.240240076  -0.18873469 -0.0357000612
## ActionsInPAC         0.140302896 -0.045892563   0.09527073  0.0107035499
## TotalMapExplored     0.227717341 -0.090467041   0.06116076  0.0263622278
## WorkersMade          0.310451517 -0.092291227   0.05067888  0.0148278674
## UniqueUnitsMade      0.115625814 -0.027018592   0.03138682 -0.0009934735
## ComplexUnitsMade     0.171188405 -0.080267568   0.05928513 -0.0071538033
## ComplexAbilityUsed   0.156033303 -0.065602477   0.07471356 -0.0063108632
## MaxTimeStamp         0.007919558  0.043063714   0.00668026 -0.0045556234
## WorkersCreatedTotal  0.279704822 -0.057288202   0.05110010  0.0096910781
## UniqueHotkeysTot     0.322414631  0.015119375   0.07026197  0.0093182479
## ComplexUnitsMadeTot  0.156324642 -0.069733538   0.05689846 -0.0064590994
##                              APM SelectByHotkeys AssignToHotkeys
## LeagueIndex          0.624171097      0.42863675      0.48727918
## Age                 -0.210724021     -0.13110513     -0.10496018
## HoursPerWeek         0.246897099      0.20579209      0.15831472
## TotalHours           0.072850019      0.08182955      0.04241460
## APM                  1.000000000      0.81462419      0.53413307
## SelectByHotkeys      0.814624190      1.00000000      0.45034207
## AssignToHotkeys      0.534133067      0.45034207      1.00000000
## UniqueHotkeys        0.285059318      0.29165476      0.31945696
## MinimapAttacks       0.218562179      0.13272262      0.20543734
## MinimapRightClicks   0.306390624      0.10761412      0.15499866
## NumberOfPACs         0.635248382      0.36005682      0.45448065
## GapBetweenPACs      -0.567395627     -0.27376720     -0.37792433
## ActionLatency       -0.722253167     -0.39001250     -0.46149571
## ActionsInPAC         0.402928112      0.16696415      0.09150689
## TotalMapExplored     0.262353189      0.21405422      0.16756499
## WorkersMade          0.377718832      0.16140363      0.19701165
## UniqueUnitsMade      0.107463910      0.12433030      0.07821519
## ComplexUnitsMade     0.161772514      0.06546729      0.16984979
## ComplexAbilityUsed   0.141060512      0.06372039      0.16904834
## MaxTimeStamp        -0.006738536     -0.07638454      0.01417986
## WorkersCreatedTotal  0.334293446      0.09290693      0.20057325
## UniqueHotkeysTot     0.335985578      0.27403026      0.40255596
## ComplexUnitsMadeTot  0.139713657      0.04823870      0.15383112
##                     UniqueHotkeys MinimapAttacks MinimapRightClicks
## LeagueIndex           0.260881605   0.2705245332        0.206379509
## Age                  -0.004889105   0.0430980350       -0.019903780
## HoursPerWeek          0.062274751   0.0841042859        0.049498458
## TotalHours            0.007952950   0.0008742143        0.007662161
## APM                   0.285059318   0.2185621795        0.306390624
## SelectByHotkeys       0.291654757   0.1327226161        0.107614118
## AssignToHotkeys       0.319456963   0.2054373354        0.154998663
## UniqueHotkeys         1.000000000   0.0576324412        0.067942524
## MinimapAttacks        0.057632441   1.0000000000        0.224683035
## MinimapRightClicks    0.067942524   0.2246830349        1.000000000
## NumberOfPACs          0.102877258   0.1377446447        0.143537107
## GapBetweenPACs       -0.194726320  -0.2133273106       -0.244541960
## ActionLatency        -0.172779913  -0.1714623341       -0.216642792
## ActionsInPAC          0.141812004   0.1337114589        0.323706244
## TotalMapExplored      0.378213997   0.0214276135        0.115942799
## WorkersMade           0.130595901   0.0822811924        0.212464470
## UniqueUnitsMade       0.392893539  -0.0405750171        0.069620611
## ComplexUnitsMade     -0.110082288   0.0522298528        0.097993358
## ComplexAbilityUsed   -0.088265345   0.0422672761        0.095658935
## MaxTimeStamp         -0.421191908   0.1003636025        0.055295913
## WorkersCreatedTotal  -0.150297289   0.1399495455        0.232162687
## UniqueHotkeysTot      0.703174070   0.1510283355        0.124570093
## ComplexUnitsMadeTot  -0.143863720   0.0626391069        0.093137523
##                     NumberOfPACs GapBetweenPACs ActionLatency ActionsInPAC
## LeagueIndex           0.58919350    -0.53753566   -0.65994025   0.14030290
## Age                  -0.19712766     0.11210612    0.24024008  -0.04589256
## HoursPerWeek          0.17487149    -0.13383771   -0.18873469   0.09527073
## TotalHours            0.03957684    -0.02064435   -0.03570006   0.01070355
## APM                   0.63524838    -0.56739563   -0.72225317   0.40292811
## SelectByHotkeys       0.36005682    -0.27376720   -0.39001250   0.16696415
## AssignToHotkeys       0.45448065    -0.37792433   -0.46149571   0.09150689
## UniqueHotkeys         0.10287726    -0.19472632   -0.17277991   0.14181200
## MinimapAttacks        0.13774464    -0.21332731   -0.17146233   0.13371146
## MinimapRightClicks    0.14353711    -0.24454196   -0.21664279   0.32370624
## NumberOfPACs          1.00000000    -0.49140711   -0.81716179  -0.24257101
## GapBetweenPACs       -0.49140711     1.00000000    0.68048313  -0.31024215
## ActionLatency        -0.81716179     0.68048313    1.00000000  -0.10650457
## ActionsInPAC         -0.24257101    -0.31024215   -0.10650457   1.00000000
## TotalMapExplored      0.17048072    -0.09982669   -0.23132329   0.10436504
## WorkersMade           0.28220422    -0.23715710   -0.31369223   0.25373382
## UniqueUnitsMade      -0.06312477    -0.10869946   -0.04437974   0.18370352
## ComplexUnitsMade      0.19626756    -0.08292336   -0.19805174   0.05431177
## ComplexAbilityUsed    0.17769568    -0.09200361   -0.18968132   0.05340608
## MaxTimeStamp          0.22647439     0.05170493   -0.09126199  -0.19316272
## WorkersCreatedTotal   0.40200326    -0.19110823   -0.33994453   0.10600149
## UniqueHotkeysTot      0.35311232    -0.22374497   -0.30458277  -0.02222723
## ComplexUnitsMadeTot   0.18702246    -0.05707149   -0.18101351   0.04193400
##                     TotalMapExplored WorkersMade UniqueUnitsMade
## LeagueIndex               0.22771734  0.31045152    0.1156258141
## Age                      -0.09046704 -0.09229123   -0.0270185918
## HoursPerWeek              0.06116076  0.05067888    0.0313868195
## TotalHours                0.02636223  0.01482787   -0.0009934735
## APM                       0.26235319  0.37771883    0.1074639095
## SelectByHotkeys           0.21405422  0.16140363    0.1243302974
## AssignToHotkeys           0.16756499  0.19701165    0.0782151941
## UniqueHotkeys             0.37821400  0.13059590    0.3928935393
## MinimapAttacks            0.02142761  0.08228119   -0.0405750171
## MinimapRightClicks        0.11594280  0.21246447    0.0696206113
## NumberOfPACs              0.17048072  0.28220422   -0.0631247666
## GapBetweenPACs           -0.09982669 -0.23715710   -0.1086994618
## ActionLatency            -0.23132329 -0.31369223   -0.0443797409
## ActionsInPAC              0.10436504  0.25373382    0.1837035242
## TotalMapExplored          1.00000000  0.27366029    0.4748026933
## WorkersMade               0.27366029  1.00000000    0.2351954420
## UniqueUnitsMade           0.47480269  0.23519544    1.0000000000
## ComplexUnitsMade         -0.10106146  0.20326994   -0.1002967456
## ComplexAbilityUsed       -0.08488695  0.10495463   -0.1033550466
## MaxTimeStamp             -0.51107311 -0.13497723   -0.6465121649
## WorkersCreatedTotal      -0.08392605  0.75744388   -0.2028956790
## UniqueHotkeysTot          0.02699681  0.11131892   -0.0474386907
## ComplexUnitsMadeTot      -0.15064824  0.12914462   -0.1681055513
##                     ComplexUnitsMade ComplexAbilityUsed MaxTimeStamp
## LeagueIndex              0.171188405        0.156033303  0.007919558
## Age                     -0.080267568       -0.065602477  0.043063714
## HoursPerWeek             0.059285128        0.074713564  0.006680260
## TotalHours              -0.007153803       -0.006310863 -0.004555623
## APM                      0.161772514        0.141060512 -0.006738536
## SelectByHotkeys          0.065467292        0.063720393 -0.076384539
## AssignToHotkeys          0.169849787        0.169048336  0.014179858
## UniqueHotkeys           -0.110082288       -0.088265345 -0.421191908
## MinimapAttacks           0.052229853        0.042267276  0.100363603
## MinimapRightClicks       0.097993358        0.095658935  0.055295913
## NumberOfPACs             0.196267559        0.177695684  0.226474390
## GapBetweenPACs          -0.082923356       -0.092003610  0.051704926
## ActionLatency           -0.198051740       -0.189681317 -0.091261990
## ActionsInPAC             0.054311769        0.053406079 -0.193162721
## TotalMapExplored        -0.101061459       -0.084886946 -0.511073107
## WorkersMade              0.203269938        0.104954630 -0.134977226
## UniqueUnitsMade         -0.100296746       -0.103355047 -0.646512165
## ComplexUnitsMade         1.000000000        0.620551475  0.313762070
## ComplexAbilityUsed       0.620551475        1.000000000  0.268226968
## MaxTimeStamp             0.313762070        0.268226968  1.000000000
## WorkersCreatedTotal      0.402437476        0.297700140  0.469685149
## UniqueHotkeysTot         0.122350112        0.110003415  0.180428482
## ComplexUnitsMadeTot      0.937840929        0.626079031  0.435816249
##                     WorkersCreatedTotal UniqueHotkeysTot
## LeagueIndex                 0.279704822      0.322414631
## Age                        -0.057288202      0.015119375
## HoursPerWeek                0.051100098      0.070261968
## TotalHours                  0.009691078      0.009318248
## APM                         0.334293446      0.335985578
## SelectByHotkeys             0.092906933      0.274030256
## AssignToHotkeys             0.200573250      0.402555962
## UniqueHotkeys              -0.150297289      0.703174070
## MinimapAttacks              0.139949545      0.151028336
## MinimapRightClicks          0.232162687      0.124570093
## NumberOfPACs                0.402003257      0.353112320
## GapBetweenPACs             -0.191108228     -0.223744973
## ActionLatency              -0.339944530     -0.304582772
## ActionsInPAC                0.106001488     -0.022227233
## TotalMapExplored           -0.083926053      0.026996810
## WorkersMade                 0.757443878      0.111318923
## UniqueUnitsMade            -0.202895679     -0.047438691
## ComplexUnitsMade            0.402437476      0.122350112
## ComplexAbilityUsed          0.297700140      0.110003415
## MaxTimeStamp                0.469685149      0.180428482
## WorkersCreatedTotal         1.000000000      0.219339365
## UniqueHotkeysTot            0.219339365      1.000000000
## ComplexUnitsMadeTot         0.420127786      0.122796085
##                     ComplexUnitsMadeTot
## LeagueIndex                 0.156324642
## Age                        -0.069733538
## HoursPerWeek                0.056898464
## TotalHours                 -0.006459099
## APM                         0.139713657
## SelectByHotkeys             0.048238700
## AssignToHotkeys             0.153831117
## UniqueHotkeys              -0.143863720
## MinimapAttacks              0.062639107
## MinimapRightClicks          0.093137523
## NumberOfPACs                0.187022461
## GapBetweenPACs             -0.057071490
## ActionLatency              -0.181013507
## ActionsInPAC                0.041934003
## TotalMapExplored           -0.150648238
## WorkersMade                 0.129144625
## UniqueUnitsMade            -0.168105551
## ComplexUnitsMade            0.937840929
## ComplexAbilityUsed          0.626079031
## MaxTimeStamp                0.435816249
## WorkersCreatedTotal         0.420127786
## UniqueHotkeysTot            0.122796085
## ComplexUnitsMadeTot         1.000000000

At first glance, it looks like APM, PAC data, and hot key usage are the strongest predictors of league placement, as expected. A little surprisingly, age has little correlation with any of the variables. I’m most interested in the relationship between a player’s action variables (APM, PAC data, workers made, etc.) and league placement.
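For readers who want to see what goes into each cell of the matrix, here is a minimal Python sketch of the Pearson coefficient (the matrix itself came from R). The inputs are the per-league means from the summaries that follow. Because these are eight group averages rather than 3395 individual players, the coefficients come out much closer to 1 and -1 than the player-level values in the matrix and should not be compared with them directly; the point is only the calculation and the sign.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

leagues = [1, 2, 3, 4, 5, 6, 7, 8]
# Per-league means copied from the by-league summaries in this section.
mean_apm = [59.54, 74.78, 89.97, 105.85, 131.52, 158.68, 189.6, 267.3]
mean_action_latency = [95.40, 81.27, 73.70, 64.79, 56.09, 48.95, 40.34, 35.39]

r_apm = pearson(leagues, mean_apm)
r_latency = pearson(leagues, mean_action_latency)

print(f"League vs mean APM: {r_apm:.2f}")
print(f"League vs mean ActionLatency: {r_latency:.2f}")
```

The signs match the matrix: APM rises with league placement, and action latency falls.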

## sc$LeagueIndex: 1
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   19.00   21.00   22.72   26.00   40.00 
## -------------------------------------------------------- 
## sc$LeagueIndex: 2
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   18.00   21.00   22.16   25.00   43.00 
## -------------------------------------------------------- 
## sc$LeagueIndex: 3
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   18.00   21.00   22.05   24.00   41.00 
## -------------------------------------------------------- 
## sc$LeagueIndex: 4
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   19.00   21.00   21.98   24.00   44.00 
## -------------------------------------------------------- 
## sc$LeagueIndex: 5
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   18.00   21.00   21.36   24.00   37.00 
## -------------------------------------------------------- 
## sc$LeagueIndex: 6
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   18.00   20.00   20.68   22.00   31.00 
## -------------------------------------------------------- 
## sc$LeagueIndex: 7
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   19.00   22.00   21.17   23.00   26.00 
## -------------------------------------------------------- 
## sc$LeagueIndex: 8
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##      NA      NA      NA     NaN      NA      NA      55

## sc$LeagueIndex: 1
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   22.06   43.32   54.05   59.54   72.50  172.95 
## -------------------------------------------------------- 
## sc$LeagueIndex: 2
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   24.66   57.21   71.68   74.78   89.21  179.62 
## -------------------------------------------------------- 
## sc$LeagueIndex: 3
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   29.82   68.14   85.96   89.97  104.85  226.66 
## -------------------------------------------------------- 
## sc$LeagueIndex: 4
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   38.03   80.74  103.81  105.85  123.73  249.02 
## -------------------------------------------------------- 
## sc$LeagueIndex: 5
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   49.74  102.84  126.01  131.52  152.58  372.64 
## -------------------------------------------------------- 
## sc$LeagueIndex: 6
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   65.37  125.13  152.19  158.68  187.19  389.83 
## -------------------------------------------------------- 
## sc$LeagueIndex: 7
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   115.8   168.1   185.3   189.6   212.4   298.8 
## -------------------------------------------------------- 
## sc$LeagueIndex: 8
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   146.4   222.2   274.3   267.3   313.3   375.9

I expected age to be a factor in overall skill for a variety of reasons. From the plot, though, it looks like every league has similar age means and medians. In contrast, there is a stark rise in APM as league placements go up. So as expected, APM is worth exploring further.

## sc$LeagueIndex: 1
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   51.11   77.88   93.33   95.40  107.82  173.56 
## -------------------------------------------------------- 
## sc$LeagueIndex: 2
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   40.01   68.48   79.24   81.27   90.18  176.37 
## -------------------------------------------------------- 
## sc$LeagueIndex: 3
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   36.58   62.46   72.10   73.70   83.07  150.30 
## -------------------------------------------------------- 
## sc$LeagueIndex: 4
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   35.14   55.52   62.77   64.79   72.24  129.85 
## -------------------------------------------------------- 
## sc$LeagueIndex: 5
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   30.76   48.10   54.99   56.09   62.72  103.38 
## -------------------------------------------------------- 
## sc$LeagueIndex: 6
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   24.63   41.83   47.96   48.95   55.33   88.32 
## -------------------------------------------------------- 
## sc$LeagueIndex: 7
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   29.99   34.91   39.73   40.34   45.73   52.31 
## -------------------------------------------------------- 
## sc$LeagueIndex: 8
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   24.09   31.28   35.41   35.39   38.58   54.56

Hey, that’s pretty neat. Mean action latency falls with every step up in league, from 95.40 ms in league 1 to 35.39 ms in league 8, or a little under 10 ms per league on average! The trend is also expected, as the players who do more are better.
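The per-league drop can be worked out directly from the means in the summaries above. This is a small Python sketch of that arithmetic (not the original R code), with variable names of my own choosing.

```python
# Mean ActionLatency for leagues 1 through 8, copied from the summaries above.
mean_latency = [95.40, 81.27, 73.70, 64.79, 56.09, 48.95, 40.34, 35.39]

# Drop in mean latency between each league and the next one up.
drops = [lower - higher for lower, higher in zip(mean_latency, mean_latency[1:])]
average_drop = sum(drops) / len(drops)

for league, drop in enumerate(drops, start=1):
    print(f"League {league} -> {league + 1}: {drop:.2f}")
print(f"Average drop per league: {average_drop:.2f}")
```

The steps are not uniform: the biggest drop is between leagues 1 and 2 (about 14), the smallest is between leagues 7 and 8 (about 5), and the rest sit between 7 and 9.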

## sc$LeagueIndex: 1
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 2.005e-05 4.270e-05 4.527e-05 6.400e-05 2.411e-04 
## -------------------------------------------------------- 
## sc$LeagueIndex: 2
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 2.070e-05 4.000e-05 4.494e-05 6.305e-05 1.404e-04 
## -------------------------------------------------------- 
## sc$LeagueIndex: 3
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 2.460e-05 4.330e-05 4.943e-05 6.850e-05 3.376e-04 
## -------------------------------------------------------- 
## sc$LeagueIndex: 4
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 2.950e-05 4.820e-05 5.257e-05 7.055e-05 2.350e-04 
## -------------------------------------------------------- 
## sc$LeagueIndex: 5
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 3.810e-05 5.665e-05 6.351e-05 8.225e-05 2.935e-04 
## -------------------------------------------------------- 
## sc$LeagueIndex: 6
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 4.920e-05 7.000e-05 7.445e-05 9.290e-05 2.819e-04 
## -------------------------------------------------------- 
## sc$LeagueIndex: 7
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 2.960e-05 5.280e-05 7.870e-05 7.979e-05 1.025e-04 1.727e-04 
## -------------------------------------------------------- 
## sc$LeagueIndex: 8
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.0000373 0.0000754 0.0001045 0.0001100 0.0001325 0.0003179

While it does look like better players are using their hot keys, comparing the plots of unique hot keys and assigning to hot keys doesn’t show you how hot keys are used. So we are still unsure of what better players are doing differently.

I thought complex units and workers made would favor higher level players as a general trend. The one problem I see with workers made is that it is highly dependent on the game rather than on how good a player is. It looks like the median values of workers made plateau after a certain point. This could just mean it’s one of the first skills learned by players. I figured only better players would use complex units because they’re difficult to control. There are a variety of possible reasons why this turns out to be untrue to a degree.

When looking at a player’s APM, workers created isn’t very telling. This further confirms my thoughts that workers created in a game is too dependent on context to help determine skill levels. The other plots give a little insight into what players are doing with their actions.

While it does make sense that players who play faster are switching screens more, seeing lower delays in actions and smaller gaps between switching screens gives us clues as to what better players are doing differently with their applied actions in game.

Some players don’t use minimap attacks at all. This might be a clue that it is an underused mechanic at the time of data collection.

Bivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation. How did the feature(s) of interest vary with other features in
the dataset?

As expected, the better players do everything a little more than lower placed players do. The better players utilize more mechanical tools, play much faster, and switch to different parts of the map more. They use more hot keys, make more workers up to a point, act faster, and, possibly most importantly, act around the map more often.

When combing through APM usage, it looks like players with higher APM are constantly clicking around the map to utilize more actions and are constantly using their hot keys. Naturally, the faster they play, the less delay there is between actions. However, they are not right clicking onto the map very much with all that speed.

Did you observe any interesting relationships between the other features
(not the main feature(s) of interest)?

I was surprised by how weak the correlation between hot key assignment and hot key usage was when plotted against APM. I would like to see how action latency, hot key usage, APM, and league placement all go together. I’m surprised at how complex units just drop off at the top tiers and have a spike in the mid tiers. I am guessing players are making complex units out of novelty.

What was the strongest relationship you found?

League placement and APM are both strongly negatively correlated with action latency. All the PAC data correlate very well with league placement. It’s also no surprise that the PAC data correlate with APM, since APM naturally and directly affects any PAC numbers.
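As a rough illustration of how these relationships could be quantified, the sketch below computes a rank correlation matrix in base R. The numbers are toy values invented for illustration, and `APM` and `ActionLatency` are assumed column names. Spearman is used here as one reasonable choice because league index is an ordered category rather than a measurement; it is not necessarily the method behind the plots above.

```r
# Toy values made up for illustration; NOT taken from the replay dataset.
# APM and ActionLatency are assumed column names.
toy <- data.frame(
  LeagueIndex   = c(1, 2, 3, 4, 5, 6, 7, 8),
  APM           = c(45, 60, 80, 95, 130, 160, 190, 270),
  ActionLatency = c(110, 95, 80, 72, 60, 50, 42, 35)
)

# Spearman correlation works on ranks, so it only assumes a monotonic
# relationship and tolerates the ordinal league scale.
rho <- cor(toy, method = "spearman")
print(round(rho, 2))
```

On the real data, a strongly negative entry for `ActionLatency` against both `LeagueIndex` and `APM` would match the pattern described above.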

Multivariate Plots Section

When looking at the gaps between actions, you can easily tell skill and speed go together. However, the plots start to look the same once players use more than 5 unique hot keys.

When comparing league, number of PACs, and actions in PACs, there appears to be no real relation among these three variables. Better players don’t appear to do any more per PAC than worse players do. Hours per week is not very telling of how good someone is. I feel this may be a data collection error rather than actual truth.
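One way to probe the suspected data collection problem is to flag hours-per-week values that are implausible. The sketch below uses toy data invented for illustration, and the cutoff of 112 hours (16 hours a day) is an arbitrary assumption, not something taken from the dataset or the original study.

```r
# Toy values made up for illustration; NOT taken from the replay dataset.
toy <- data.frame(
  GameID       = 1:6,
  HoursPerWeek = c(0, 8, 12, 20, 56, 168)
)

# Assumed plausibility cutoff: 16 hours a day, 7 days a week.
max_plausible <- 112

# Flag players reporting zero hours or more than the assumed cutoff.
suspect <- toy[toy$HoursPerWeek == 0 | toy$HoursPerWeek > max_plausible, ]
print(suspect)
```

If many rows were flagged on the real data, that would be a hint that self-reported hours are too noisy to say much about skill.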

There’s a clear relationship between time spent in one screen section and the number of PACs in a game. If anything, it’s interesting to see how small the difference is between good and bad players just looking at total milliseconds spent on one screen.

When faceted by unique hot keys, I think the most surprising thing is that some very good players barely use any hot keys or none at all! I figured this was a pretty universal mechanic that only beginners neglect. I’m even surprised to see platinum (LeagueIndex == 4) players neglect hot keys completely. However, it does look like once you get to 6 hot keys and beyond, you are able to do more, and do it faster.
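The faceted comparison can also be summarized numerically. A minimal sketch, assuming columns named `UniqueHotkeys` and `APM`: bucket players by how many unique hot keys they use and compare median APM per bucket. The data and the bucket boundaries are made up for illustration.

```r
# Toy values made up for illustration; NOT taken from the replay dataset.
# UniqueHotkeys and APM are assumed column names.
toy <- data.frame(
  UniqueHotkeys = c(0, 1, 3, 4, 5, 6, 7, 9),
  APM           = c(70, 65, 90, 100, 110, 150, 170, 210)
)

# Bucket boundaries are an arbitrary choice: 0-2, 3-5, and 6 or more hot keys.
toy$HotkeyGroup <- cut(
  toy$UniqueHotkeys,
  breaks = c(-Inf, 2, 5, Inf),
  labels = c("0-2", "3-5", "6+")
)

# Median APM within each hot key bucket.
median_apm <- tapply(toy$APM, toy$HotkeyGroup, median)
print(median_apm)
```

A jump in median APM for the "6+" bucket on the real data would be consistent with the observation that players with six or more hot keys act faster.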

While it’s clear that better players simply play faster, they’re also using their actions to click on the minimap, whether it be for attacking or exploring. As the league index rises, the graph shifts towards the right.

Again, if there’s anything surprising about these plots, it’s that there are masters level players with <= 100 APM.

While it’s clear that reaction speed and playing speed are good indicators of league index, hot key usage is not as telling. There is some tendency for better players to utilize more hot keys, but their actions are spent doing other things. It does look like the majority of players use at least four hot keys.

Multivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation. Were there features that strengthened each other in terms of
looking at your feature(s) of interest?

Perhaps this isn’t really surprising, but speed plays a huge part in league placement, larger than the other mechanics like hot key usage. While APM naturally correlates with PAC data, comparing gaps between PACs, actions in PAC, and APM shows that faster players are constantly moving their screens rather than sending multiple actions per change in screen.

Were there any interesting or surprising interactions between features?

I’m surprised at how little hot key usage affects overall league placement and skill. Most of the pros, and I’m assuming most of the better players, played StarCraft Brood War, so constantly selecting, assigning, re-assigning, and using all your hot keys from 1 to 0 as well as F2, F3, F4, and F5 seems like a natural transition. This may be a clue about how StarCraft 2 is different mechanically from StarCraft Brood War.


Final Plots and Summary

Plot One: Speed Matters

Description One

No surprises here. Better players act faster and play faster. They also switch screens faster than worse players do. One way StarCraft 2 players gain an advantage is to attack multiple spots on the map to misdirect the opponents’ screens. Being able to switch back and forth faster is easily an advantage.

Plot Two: But Hot Keys, Not So Much

Description Two

While we can clearly tell playing speed matters, we could also tell from the previous plot that moving around the map matters too. This plot shows that utilizing hot keys doesn’t matter as much. It looks like most players use at least two hot keys, possibly one for production and one for producing workers. Otherwise, there aren’t that many differences after 5 unique hot keys. The most surprising thing is that some very good players (diamond and masters) aren’t using hot keys at all.

Plot Three: What Else is All That APM Doing?

Description Three

As we move up in league index, we can see a shift towards the right of each plot, indicating that better players utilize minimap attacks and minimap right clicks more. In addition, it follows the trend that the best players are utilizing more available game mechanics than worse players. Both grandmaster and pro level players are using far more minimap clicks, whether it be for scouting or moving armies while doing other things. This could be an indicator that using minimap actions was one of the last things players at the time learned.


Reflection

The StarCraft 2 replay dataset contained final game data on a variety of topics. Initially, I found the idea of a perception action cycle to be the most interesting because actions per minute data is regarded as noisy data. I did try to suss out what APM is being spent doing and whether there are stark differences in leagues. For instance, are there certain mechanics that some levels are using that lower ones aren’t?

The data also made it very clear that, in a game where you will never be able to do everything, how quickly you can act matters most. However, rather than mindlessly running through the game, it is more important to act quickly in various parts of the map. Being able to act swiftly on multiple screens is the mark of a great player.

While there are no certainties, it looks as if better players are also utilizing as many game mechanics as possible. They input actions into the minimap to explore or move armies for multi-pronged attacks, utilize at least four hot keys, and continuously make workers to proper saturation points. I was most surprised by how little players utilize hot keys based on what I see now.

The main issue I see with this data set is that it was collected in the early days of StarCraft 2, when players were still learning how to play. If I were to look at a newer dataset, I’d guess players utilize more hot keys, are even faster, and that the differences between leagues will be less about speed and more about efficient use of actions. I’d also guess the minimum number of unique hot keys will go up from 4 to at least 7 when including the number row and the function row on keyboards.

The other big limitation I find with this data is that it’s a snapshot at the end of the game. I would have liked to examine APM ranges within the game, peak actions, when certain actions such as map exploration happen, and how everything contributes to whether a player won or lost a game.

To expand on exploration, I’d examine how time affects games next. Player bases always complain about how mid league placements (4, 5, 6) consistently have players pulling scummy tactics to win games. Those games are typically shorter than “proper” games and are frowned upon by the community.